Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure Cosmos DB for MongoDB extension #329

Closed

Conversation

aayush3011
Copy link

@aayush3011 aayush3011 commented Feb 28, 2024

Motivation and Context (Why the change? What's the scenario?)

I have added Azure Cosmos DB Mongo vCore as Memory to be used by customers using Kernel Memory and Cosmos DB for their LLM applications.

https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search

@aayush3011 aayush3011 marked this pull request as ready for review March 1, 2024 15:34
@dluc dluc force-pushed the users/akataria/cosmosDBMongoVCore branch from 68c95e6 to 31fad13 Compare March 21, 2024 22:45
Copy link
Collaborator

@dluc dluc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the new projects to the solution file. I see a lot of build warnings and errors due to code style, could you take a look?

@aayush3011
Copy link
Author

@dluc I have added the projects to the Solution file, and renamed the AzureCosmosDBMongoDBVCore to AzureCosmosDBMongoDB as decided by the team for the external connectors/integrations.

@dluc dluc force-pushed the users/akataria/cosmosDBMongoVCore branch from 3fa2e2b to 1f6a2cd Compare April 17, 2024 22:24
@dluc
Copy link
Collaborator

dluc commented Apr 17, 2024

I fixed some compilation errors, but there are still several warnings to address:

extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(35,74): warning CS1570: XML comment has badly formed XML -- 'End tag 'parama' does not match the start tag 'param'.' [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(122,59): warning CS8425: Async-iterator 'AzureCosmosDBMongoDBMemory.GetSimilarListAsync(string, string, ICollection<MemoryFilter>?, double, int, bool, CancellationToken)' has one or more parameters of type 'CancellationToken' but none of them is decorated with the 'EnumeratorCancellation' attribute, so the cancellation token parameter from the generated 'IAsyncEnumerable<>.GetAsyncEnumerator' will be unconsumed [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(163,49): warning CS8425: Async-iterator 'AzureCosmosDBMongoDBMemory.GetListAsync(string, ICollection<MemoryFilter>?, int, bool, CancellationToken)' has one or more parameters of type 'CancellationToken' but none of them is decorated with the 'EnumeratorCancellation' attribute, so the cancellation token parameter from the generated 'IAsyncEnumerable<>.GetAsyncEnumerator' will be unconsumed [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBConfig.cs(74,12): warning CS8618: Non-nullable property 'ConnectionString' must contain a non-null value when exiting constructor. Consider declaring the property as nullable. [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemoryRecord.cs(46,41): warning CS8604: Possible null reference argument for parameter 'value' in 'void Dictionary<string, object>.Add(string key, object value)'. [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemoryRecord.cs(21,20): warning CS8618: Non-nullable property 'Embedding' must contain a non-null value when exiting constructor. Consider declaring the property as nullable. [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(135,35): warning CS8600: Converting null literal or possible null value to non-nullable type. [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(163,49): warning CS1998: This async method lacks 'await' operators and will run synchronously. Consider using the 'await' operator to await non-blocking API calls, or 'await Task.Run(...)' to do CPU-bound work on a background thread. [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemoryRecord.cs(21,20): warning CA1819: Properties should not return arrays (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca1819) [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBVectorSearchType.cs(9,5): warning CA1707: Remove the underscores from member name Microsoft.KernelMemory.MemoryDb.AzureCosmosDBMongoDB.AzureCosmosDBVectorSearchType.vector_ivf (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca1707) [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBVectorSearchType.cs(10,5): warning CA1707: Remove the underscores from member name Microsoft.KernelMemory.MemoryDb.AzureCosmosDBMongoDB.AzureCosmosDBVectorSearchType.vector_hnsw (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca1707) [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(73,47): warning CA1725: In member Task AzureCosmosDBMongoDBMemory.CreateIndexAsync(string indexName, int vectorSize, CancellationToken cancellationToken = default(CancellationToken)), change parameter name indexName to index in order to match the identifier as it has been declared in Task IMemoryDb.CreateIndexAsync(string index, int vectorSize, CancellationToken cancellationToken = default(CancellationToken)) (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca1725) [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(75,29): warning CA2016: Forward the 'cancellationToken' parameter to the 'ListAsync' method or pass in 'CancellationToken.None' explicitly to indicate intentionally not propagating the token (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca2016) [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(76,14): warning CA2016: Forward the 'cancellationToken' parameter to the 'ToList' method or pass in 'CancellationToken.None' explicitly to indicate intentionally not propagating the token (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca2016) [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(89,19): warning CA2016: Forward the 'cancellationToken' parameter to the 'RunCommandAsync' method or pass in 'CancellationToken.None' explicitly to indicate intentionally not propagating the token (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca2016) [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(105,15): warning CA2016: Forward the 'cancellationToken' parameter to the 'DropOneAsync' method or pass in 'CancellationToken.None' explicitly to indicate intentionally not propagating the token (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca2016) [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(146,34): warning CA2016: Forward the 'cancellationToken' parameter to the 'AggregateAsync' method or pass in 'CancellationToken.None' explicitly to indicate intentionally not propagating the token (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca2016) [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(249,92): warning CA1305: The behavior of 'float.ToString()' could vary based on the current user's locale settings. Replace this call in 'AzureCosmosDBMongoDBMemory.GetVectorIVFSearchPipeline(Embedding, int)' with a call to 'float.ToString(IFormatProvider)'. (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca1305) [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(279,92): warning CA1305: The behavior of 'float.ToString()' could vary based on the current user's locale settings. Replace this call in 'AzureCosmosDBMongoDBMemory.GetVectorHNSWSearchPipeline(Embedding, int)' with a call to 'float.ToString(IFormatProvider)'. (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca1305) [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]
extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDBMemory.cs(27,35): warning CA1859: Change type of field '_cosmosDBMongoClient' from 'MongoDB.Driver.IMongoClient' to 'MongoDB.Driver.MongoClient' for improved performance (https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca1859) [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]

I'm also getting a compilation error:

/usr/local/share/dotnet/sdk/8.0.201/Sdks/NuGet.Build.Tasks.Pack/build/NuGet.Build.Tasks.Pack.targets(221,5): error NU5019: File not found: 'extensions/AzureCosmosDBMongoDB/README.md'. [extensions/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB/AzureCosmosDBMongoDB.csproj]

@dluc
Copy link
Collaborator

dluc commented Apr 17, 2024

Could you please also squash the commits into a single commit? that will make it easier to manage the PR while other PRs get merged. Thanks!

Copy link
Collaborator

@dluc dluc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build issues to be addressed before testing - thanks

@dluc dluc added waiting for author Waiting for author to reply or address comments and removed ready for review labels Apr 17, 2024
Changes for Azure CosmosDB Mongo vCore

Adding some comments

Formatting

Formatting

Resovling comments

Resovling comments

Resolving comments

Adding changes for Azure cosmos mongo vcore memory store

Changes for Azure CosmosDB Mongo vCore

Adding some comments

Formatting

Formatting

Resovling comments

Resovling comments

Resolving comments

Misc fixes

Formatting changes

Fixing build warning

Fixing build warning
@aayush3011 aayush3011 force-pushed the users/akataria/cosmosDBMongoVCore branch from ad2002c to e95cfc3 Compare April 19, 2024 02:56
@aayush3011
Copy link
Author

@dluc I have addressed all your comments, the build should also be passing now. Let me know if something else is required from my side.

Copy link

@Josephrp Josephrp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fantastic effort, thanks for following through

@dluc dluc force-pushed the users/akataria/cosmosDBMongoVCore branch from 6806903 to ee5d060 Compare April 22, 2024 18:39
@dluc dluc changed the title Adding changes for Azure cosmos mongo vcore memory store Azure Cosmos DB for MongoDB extension Apr 22, 2024
dluc added 2 commits April 22, 2024 12:50
Adding changes for Azure cosmos mongo vcore memory store
Changes for Azure CosmosDB Mongo vCore
Adding some comments
Formatting
Formatting
Resovling comments
Resovling comments
Resolving comments
Adding changes for Azure cosmos mongo vcore memory store
Changes for Azure CosmosDB Mongo vCore
Adding some comments
Formatting
Formatting
Resovling comments
Resovling comments
Resolving comments
Misc fixes
Formatting changes
Fixing build warning
Fixing build warning
@dluc dluc force-pushed the users/akataria/cosmosDBMongoVCore branch from ee5d060 to ccced1d Compare April 22, 2024 19:51

namespace AzureCosmosDBMongoDB.FunctionalTests;

public class DefaultTests : BaseFunctionalTestCase
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please run the tests and make sure they are all passing

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the mongo vCore implementation you cannot have indexes with same properties and different names. The test cases in this class create multiple indexes with just different names but same index properties, and they will always fail in our case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's stopping from having multiple indexes with the same schema?

Copy link
Collaborator

@dluc dluc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Major things to be addressed:

  • Functional tests should pass
  • Filtering should be implemented
  • Remove Newtonsoft dependency

@dluc dluc added work in progress and removed waiting for author Waiting for author to reply or address comments labels Apr 24, 2024
@dluc
Copy link
Collaborator

dluc commented May 11, 2024

@aayush3011 how is it going with functional tests and filters? Did you consider reusing the logic from the MongoDbAtlas connector?

@dluc dluc added the waiting for author Waiting for author to reply or address comments label May 11, 2024
@dluc
Copy link
Collaborator

dluc commented May 17, 2024

Looks like the PR got stale, closing it for now. We can reopen it if someone can complete the connector and address the issues identified.

@dluc dluc closed this May 17, 2024
@dluc dluc reopened this May 23, 2024
@aayush3011
Copy link
Author

@dluc I have resolved most of the comments, please take another look at the PR once you get a chance.

@aayush3011
Copy link
Author

@dluc any updates on this?

@dluc
Copy link
Collaborator

dluc commented Jul 27, 2024

Closing for now. We'll reconsider once the new SK connector is ready.

@dluc dluc closed this Jul 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting for author Waiting for author to reply or address comments work in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants